Analytical techniques for mapping multi-hazard with geo-environmental modeling approaches and UAV images

The quantitative spatial analysis is a strong tool for the study of natural hazards and their interactions. Over the last decades, a range of techniques have been exceedingly used in spatial analysis, especially applying GIS and R software. In the present paper, the multi-hazard susceptibility maps compared in 2020 and 2021 using an array of data mining techniques, GIS tools, and Unmanned aerial vehicles. The produced maps imply the most effective morphometric parameters on collapsed pipes, gully heads, and landslides using the linear regression model. The multi-hazard maps prepared using seven classifiers of Boosted regression tree (BRT), Flexible discriminant analysis (FDA), Multivariate adaptive regression spline (MARS), Mixture discriminant analysis (MDA), Random forest (RF), Generalized linear model (GLM), and Support vector machine (SVM). The results of each model revealed that the greatest percentage of the study region was low susceptible to collapsed pipes, landslides, and gully heads, respectively. The results of the multi-hazard models represented that 52.22% and 48.18% of the study region were not susceptible to any hazards in 2020 and 2021, while 6.19% (2020) and 7.39% (2021) of the region were at the risk of all compound events. The validation results indicate the area under the receiver operating characteristic curve of all applied models was more than 0.70 for the landform susceptibility maps in 2020 and 2021. It was found where multiple events co-exist, what their potential interrelated effects are or how they interact jointly. It is the direction to take in the future to determine the combined effect of multi-hazards so that policymakers can have a better attitude toward sustainable management of environmental landscapes and support socio-economic development.

One natural event may trigger or increase the probability of the occurrence of one or more other natural hazards [1][2][3] . For example, VanDine and Bovis 4 and Lucà et al. 5 suggested that landslides occur in gullies due to the volumetric sediment concentration. The authors also explained that debris flows could influence gully heads on a steeper slope. Meanwhile, Kukemilks and Saks 6 reported that landslides can occasionally form and develop on the gully bed, improving close relationships. Indeed, Landslides are defined by the down-slope movement of debris, rock, and soil mainly caused by land gravity. They are classified according to their movement type and composite materials 7,8 . Piping erosion divided into two groups, including closed depressions and sinkholes, is also defined as one hazard that has a complex interaction with gully heads 9 and landslides. Closed depressions initiated when the soil surface steadily lowered above a surface with no break in the vegetation cover 10 ; these can eventually change into sinkholes. Sinkholes developed when the surface soil material was plainly interrupted and collapsed 11 . Gully heads are also defined as a natural, mostly vertical fall of the gully channel wall 12 . They are often treated as independent or isolated. Further, there is a need for an alternative (a multi-hazard approach) to recognize all feasible natural events and their interrelationships.
The eventuality of happening collapsed pipes, gully heads, and landslides poses a substantial environmental and physical threat to the general public of arid and semi-arid regions 13 . Their measuring, modeling, and monitoring are not consistently economically or technically feasible; thus, quantitative susceptibility assessment considering "multi-hazard scenarios" and their probable outcomes is becoming a research question and developing a framework for future studies 14,15 . A multi-hazard chain is a series of events that happen in a successional Providing multi-hazard maps in 2020 and 2021. This section described the susceptibility maps created for three studied hazards using MARS, BRT, FDA, MDA, GLM, RF, and SVM. Figures 2 and 3 showed the susceptibility maps of three natural hazards (i.e., gully heads, collapsed pipes, and landslides) produced using the FDA, MARS, and GLM models in 2020 and 2021. As it is shown in these two figures, the percentage of moderate www.nature.com/scientificreports/ and high susceptibility classes for all models increased in 2021 compared to 2020. In contrast, the percentage of low and very high susceptibility classes decreased from 2020 to 2021. For example, according to the gully heads susceptibility maps analyzed by MARS in 2020, 70.52%, 6.12%, 5.51%, and 17.85% of the total area were classified into "low, moderate, high, and very high classes", respectively (Fig. 1). Although, susceptibility classes of gully heads created by the MARS classifier in 2021 were 65.76% (low), 13.30% (moderate), 9.61% (high), and 11.33% (very high) (Fig. 2). Regarding the landslide susceptibility map using GLM in 2020, the greatest percentage of the region (69.92%) had low susceptibility compared with moderate (8.86%), high (8.61%), and very high (12.54%) (Fig. 2). In 2021, the percentage of low (67.21%) and very high (11.98%) landslide susceptibility classes decreased, although an increase was observed in moderate (10.63%) and high (10.11%) susceptibility classes. Based on the collapsed pipes susceptibility map results in 2020, 36.12%, 20.23%, 22.74%, and 20.90% of the region were classified into "low, moderate, high, and very high classes", respectively (Fig. 1). In 2021, low (38.27%) and very high (22.80%) classes were enhanced, while moderate (18.44%) and high (20.49%) classes were reduced (Fig. 2).
To prepare a multi-hazard map in 2020 and 2021, three higher accuracy susceptibility maps (one for each hazard) in 2020 and three higher accuracy susceptibility maps ((one for each hazard)) in 2021 were considered together (Fig. 3). The joint map in the form of a multi-event in 2020 and 2021 disclosed that most of the regions under study are not susceptible to any natural hazards, although a few percentages of regions are at hazardous risk of three natural events when analyzed jointly. Indeed, the results of the compound events analysis in 2020 and 2021 (Fig. 3) showed that 52.22% (in 2020) and 48.18% (in 2021) of the study area are not clearly susceptible to compound events, whereas 6.13% (in 2020) and 7.39% (in 2021) of the study area are susceptible to combined hazard of collapsed pipes, gully heads, and landslides (Fig. 4). The susceptibility classes of the compound events    www.nature.com/scientificreports/ slides susceptibility maps in 2020, respectively. While, the MARS, RF, and BRT had the highest accuracy (excellent with more than 0.9%) for gully heads, collapsed pipes, and landslides susceptibility maps of 2021, respectively (Tables 1, 2 (Tables 7, 8, and 9). In the other words, the SVM and MARS models had excellent accuracy, especially regarding mapping gully heads, due to the strength of the model in data-driven aspects and non-linearity. Besides the low cost of the algorithm construction, it makes them exceedingly usable for predicting and assessing dynamic factors (i.e., anthropogenic or hydrological information). Other classifiers such as GLM and BRT, were more capable for prediction of landslides in 2020 and 2021, selected in this study because they consider nonlinear correlations among independent and dependent factors. The spatial accuracy of all models was more than 90% (excellent) for gully heads and landslides in 2020 and 2021, while it was very good (more than 80%) for collapsed pipes in the two studied years. Furthermore, all applied classifiers had excellent accuracy for predicting gully heads, collapsed pipes, and landslides in 2020 and 2021.

Discussion
Multicollinearity analysis. This study investigated the corresponding factors of environmental variables prone to combinations of collapsed pipes, gully heads, and landslides using a linear regression model that allows comparing different events in one site. According to multicollinearity analysis, the first or second statistically standardized beta value among environmental covariables was recognized between collapsed pipes/gully heads and land use. Hosseinalizadeh et al. 11 described that land use affects hydrological processes, which is erosional landform. These processes also affect subsurface water accumulation and make the material dissolve on the hardpan layer and eventually create sinkholes. Hydrological processes can also lead to the connection of a single sinkhole, which can lead to the formation of blind gullies that will eventually connect to the drainage network and form gully heads and gully network. In particular, an overland flow that is more than the substrate or soil infiltration capacity can be very adequate for initiating erosional processes (i.e., surface and inter-rill erosion)  www.nature.com/scientificreports/ and slopes developing in regions with sparse land cover. It is exceedingly represented that the water and soil managers and landowners are forcefully linked. Moreover, the highest standardized beta value in analyzing gully heads/landslides was slope degree. Our fieldwork experiences showed that steeper slopes may increase a region's vulnerability to massive movements, including landslide forms. The steepness of slopes predominantly stimulates runoff velocity to increase through time, resulting in a vertical collapse in a gully channel bed or sudden displacement of sediments down the slope. Regarding slope, gully heads and landslides happen in the steep slopes of the central parts of the study area as stream density plays a significant role downstream of these slopes. The significant positive effects of slope and stream density on the occurrence of landslides have been reported by Hua et al. 44 .
Multi-hazard maps produced using seven classifiers in 2020 and 2021. The main benefit of machine learning classifiers compared to other statistical-based methods is that they can easily solve the prob-  www.nature.com/scientificreports/ lem of noises in the dataset and are highly accurate in the confined measurement errors or the existence of uncertain data. However, seven classifiers and high accuracy imagery techniques were used to solve this problem. Well-documented advantages make the data mining classifiers, especially RF, which had the highest accuracy for predicting collapsed pipes in 2020, appropriate for monitoring the changes in combination with natural hazards. First, the applied classifiers are fast and straightforward, defined by great prediction performance 45 . For example, the RF algorithm provides an internally even-handed assessment of generalizability with a precise classifier in forest building and thus, prepares higher quality forecasts robustness 27 . One of the best achievements of the present study regarding the studied combined hazards is the preparation of multi-hazard susceptibility maps in 2020 and 2021, which could very accurately predict the region susceptible to compound events, and thus, help us to focus our future researches here. Skilodimou et al. 46 combined the statistical-based map of earthquakes, floods, and landslides and provide a single multi-hazard probability map. Yanar et al. 47 provided a multi-hazard map of landslides and floods using the Mamdani fuzzy algorithm. Pourghasemi et al. 48 studied the spatial behavior of compound events of flood, forest fire and landslide in Shiraz City, Iran and provided a multi-hazard map of interested hazards using the RF model. The multi-hazard susceptibility maps modeled in 2020 revealed that the places where all three hazards occurred together were less, but in 2021, this amount has increased. Thus, what is essential is the identification of factors or processes causing increasing these hazard occurrences at specific locations within the study area. Otherwise, a year should not impact increasing the occurrences of three combined hazards together.
Besides, collapsed pipes are the most dangerous hazard, followed by landslides and gully heads (see Fig. 4). Perhaps, the transformation of each of these hazards into one another leads to these results over one year. It means that the number of natural events that may occur cumulatively or simultaneously increases over time.
The consequences of such event relationships occurring mean that an effect is generated which is a little bit different from that of the individual events occurring in isolation. This can lead to remarkable challenges, particularly for operators of national networks and/or asset managers. In other words, hazards are mostly behaved as independent or isolated. A multi-hazard alternative approach seeks to recognize all feasible natural events and their interrelationships. One natural event increase or triggers the possibility of one or more other events. For instance, a collapsed pipe may trigger gully heads, whereas a gully head may enhance the possibility of landslides being created soon.
Considering compound hazards and using the systematic collection of input maps, statistical classifiers, and high-resolution imagery tools can support adaptive management. From this perspective, multi-hazards www.nature.com/scientificreports/ assessment can increase our knowledge of Earth's internal and external processes, monitoring and forecasting of natural events and possible consequences when they occur coincidentally. They can clearly define the triggering hazardous landforms and their impacts on another due to various Earth surface and sub-surface processes that operated over short or long geological times. Further, finding how one event can increase or trigger the probability of a secondary event is essential to mitigating and investigating multi-hazard phenomena. It may force us to apply hazard interaction classifiers or matrices to realize links between natural events and describe potential cascading hazards based on scientific knowledge.

Conclusion
Quantitative geomorphology is modeling and measuring the landform processes that shape the Earth's surface. Adopting the best controlling and managing plan for water and soil conservation will be possible whenever landforms and processes are also considered and recognized carefully. For example, collapsed pipes, gully heads, and landslides are responsible for considerable soil losses in arid and semi-arid regions and may not always be sufficient to monitor the effects of multi-hazards phenomena separately. In this study, the compound events susceptibility maps indicated damage resulting from compound events are enhanced in 2021 in comparison with the previous year, both in terms of occurrence and magnitude. Whereas the percentage of the region with no-hazards is decreasing as well. According to the AUC values, the best accuracy was specified for the GLM model for the prediction of landslide susceptibility map in 2020. Further, we attempt to examine multiple hazards instead of single approaches to achieve information on multi-hazard regions. Further, to be able to maintain and improve the sustainability of the environment and predict and reduce the effect of contemporary land surface and subsurface processes that lead to hazardous natural events (such as collapsed pipes, gully heads, and landslides), it is needed to answer the question of when and how big natural processes can be and where these compound events could occur. Besides, the effectiveness of the applied methods has been verified by the use of several statistical parameters and it resulted in quite good performances. Thus, the application of these methods is also suggestable for conservation purposes at national scales in another contexts, including in earthquake-and flood-prone areas.

Methodology
The data mining classifiers were used to analyze and to evaluate the spatial prediction of multi-hazards. The methodology, approach, and its components are shown in Fig. 6. The flowchart contains four main steps, including (1) data preparation in 2020 and 2021, i.e., obtaining the location of 281 collapsed pipes, 152 landslides, and 90 gully heads in 2020 and also 410 collapsed pipes, 328 landslides, and 198 gully heads in 2021 gathering in the field and unmanned aerial vehicles (UAV); (2) identification of the most hazardous environmental covariables come up with the occurrence of collapsed pipes, gully heads, and landslides using the linear regression algorithm; (3) spatial modeling of the collapsed pipes, gully heads, and landslides susceptibility along with validation processes applying seven data mining models; and, eventually, (4) preparing and comparing multi-hazards maps in 2020 and 2021. All statistical classifiers are exhaustively described in the researches cited below, hence, only a concise explanation is given here.
The study area was selected from the tributaries of Gorganrood with the loess-driven soils located in the east part of "Golestan province". The maximum and minimum altitudes are approximately 548 and 208 m above sea level. The climate of semi-arid using the Domarten method, and the average annual rainfall is 260 mm reported by the Iran Meteorological Agency in 2021. The "silt loam" is the texture of the topsoil. The location map of the study region in Iran (a), Golestan province (b), and the study region (c) is shown in Fig. 7  www.nature.com/scientificreports/ of three landforms (collapsed pipes, gully heads, and landslides) showed in Fig. 8. Both overgrazing and intense farming contribute to developing the different types of soil erosion in this region. Besides, between the two calendar years, 2020 and 2021, intensive rainfall occurred for 7 days and nights in 03.17.2020 and ended on 03.24.2020. According to the "State Meteorological Agency", the mean total rainfall was about 220 mm across the "Gorganrood watershed", but four or five stations recorded up to 380 mm. That's why, the numbers of each event increased, suddenly.
Gathering data related to collapsed pipes, gully heads, and landslides hazards. The comprehensive mapping was done to recognize collapsed pipes, gully heads, and landslides. The spatial location of these three natural hazards was recorded applying UAVs (Sensefly eBee x) in 2020 and 2021. Unmanned aerial vehicles were applied with a camera model named Sensefly Aeria X and focal length of 18.5 mm. The average Ground Sampling Distance (GSD) was 5 cm. The orthomosaic resolution obtained was 5 cm/pixel. The trained UAV operator and the predefined flight paths were autonomously obtained by senseFly eMotion flight planning software. The flight paths were designed to have a front overlap of 85% and side overlap of 70%. The craft maintained a flight path 220 m above the surface. Aerial images acquired from the survey were processed with the Pix4Dmapper photogrammetry software. The gathered data were applied to re-check the exact positions of these natural events mapped during intensive fieldwork. The susceptibility mapping and monitoring tools used in 2020 and 2021 obtained the samples of non-hazards and three compound events to construct susceptibility maps and their evaluation. Of the total recorded compound events in the study area, 70% were applied in the basic phases of the model-building process, and the remaining were carried out in the validation phases in 2020 and 2021. Natural Breaks classification as an excellent method in ArcMap was used to classify the hazard/ susceptibility maps 49 . Landslides are affected by a collection of anthropological (land use and road distance) and geo-environmental "(plan curvature, distance to streams, altitude, drainage density, profile curvature, topographic wetness index, and slope degree)" variables 44,45 . In other words, to assess the multicollinearity analysis between morphometric parameters and these three compound events, 9 (landslides), 11 (collapsed pipes), and 11 (gully heads) were chosen to add as independent variables. Topographical variables were extracted applying a "UAV-digital elevation model" (orthophoto images driven from the UAV) with a resolution of 1 m. Data layers were created applying ArcGIS 10.3.1 (https:// www. esri. com). The distance to stream and distance to road maps were obtained from the roads and river maps. The land use map was also created from the orthophoto images. Finally, the number of 60 soil samples (the same sample size for each study landform and also for locations with no hazard; 15) was gathered in the field and transferred to the soil laboratory (detailed in Table 10).
Multicollinearity analysis. The linear regression algorithm was used to detect the multicollinearity of environmental covariables affecting three compound events. This fitting algorithm analyzes the interaction between the response variable (the absence or presence of gully heads, collapsed pipes or landslides) and independent variables  . The model's output represents the significantly (Sig) and coefficient calculated by SAS 16.0 software. By doing linear regression, "Variance Inflation Factor (VIF) and Tolerance (TOL)" were obtained to detect the multicollinearity among the corresponding factors and to define a noise which reduces the accuracy of the final model. Particularly, if TOL < 0.1 and VIF > 10, the predictor variable is multicollinear and should be omitted from the further modelling processes 52 .  www.nature.com/scientificreports/ Support vector machine (SVM). It divides different classes with an optimal hyperplane and thus, maximizes the spatial border between them. The points nearest to the hyperplane are named "support vectors" (the main components of the training dataset). These decision rules are performed by solving a quadratic optimization quandary solely. Further, the use of the classification concept separates classes and maximizes their spatial border 53 . This hyperplane with a higher spatial border has better generalization and is stable to noise. The SVMs, as a multi-layer perceptron 54 remark a set of linear detachable training vectors and find an n-dimensional hyperplane and become different in the process of "two classes by their maximum gap" 55 . The function of the SVM response is described in Table 11.
Mixture discriminant analysis (MDA). It combines more assembled neural network classifiers due to the "nonlinear nature of its classification rules". Due to its modest structure, the MDA also makes straightforward interpretations in conjunction with linear mixture classifiers. The MDA proposes a new alternative for making three-dimensional models question or addressing ensemble modeling in remote sensing, typically from aircraft satellites 56,57 . As an advantage of MDA, we can add that it performs a mixture of classifiers with estimation using the Maximum Likelihood and expectation-maximization algorithm 58 .

Flexible discriminant analysis (FDA).
It is a statistical analysis applying a discriminant function to assign data to one or more groups or to a non-parametric version differing in certain respects from an earlier one. It uses an optimal scoring method to post-processed a multi-response regression. In other words, the FDA procedure is one of the best classifiers for optimal data-scoring to illustrate the classes and canonically spatial correlation analysis. The FDA is equal to administering a linear discriminant analysis. Semi-parametric or nonparametric regression is replaced instead of the linear regression steps. Furthermore, the algorithm uses different regression tools and produces various class boundaries and distinction rules 59,60 .

Random forest (RF).
It is a supervised classifier 61 with relatively low error compared to previous classification classifiers and various decision trees. To split each node in this algorithm, the dummy codes use the minimum node dimension, and features 62 , although the predictor is the top splitter in all trees if one of the variables has more impact on the predicting function. Further, averaging predictions from correlated trees does not significantly reduce assumed variance, and all trees are constructed in a similarly correlated pattern 61 .

Boosted regression tree (BRT).
It is an ensembled statistical-based algorithm that modifies an individual model by fitting and incorporating classifiers for the best prediction 33 . It is also combined boosting builds join with regression classifiers to reduce the amount of variance in a prediction and improve model accuracy. It makes the best or most effective use of the number of trees which set by inner bag fraction, cross-validation, and learning rate with no direct human control 34 . Eventually, the highest weakness of single tree classifiers (poor predictive implementation) may easily be solved by fitting multiple trees and rules of thumb 63 .

Multivariate adaptive regression spline (MARS).
It is a statistical-based algorithm for fitting nonlinear complex communications between independent and dependent variables while providing an interpretable spatial model 38 . It works by splitting up the ranges of the expository parameters into areas by producing a linear regression function. The Breaks numbers between areas are named "knots", although the term "basis function" implies each separate linear interval of the predictors. The basic equations as the response functions and the overall expression of MARS are described in Table 11 64 .
Generalized linear model (GLM). It linearly estimates the association between an event probability (collapsed pipes, landslides, or gully heads here) and predictors. It is also providing the possibility of an undemanding interpretation of coefficients. While considering a linear relationship between a predictor and its response is relatively unreal in some spatial modeling and may restrict predictive performance. The GLM can also assume nonlinear statistics by fitting nearly exact alteration functions to the predictors 42 . It has been illustrated as exceedingly beneficial in natural hazards, especially landslide modeling 41-65 . Evaluation of susceptibility maps. The "area under the receiver operating characteristic curve, or ROC curve", illustrates the capability of classifiers (BRT, GLM, FDA, MDA, MARS, RF, and SVM in this study) to predict the susceptibility of the study area to collapsed pipes, gully heads, and landslides. The ROC curve indicates the tradeoff between the two rates. The values of this function reflect the accuracies in a range from poor to excellent 66 .